antibodies, that specifically bind to the protein. Such specific binding agents may be used to detect
and quantify the presence of the 15 kDa selenoprotein in biological samples, and may be used in 
methods for detecting susceptibility to, or the presence of, cancer or monitoring the progression of 
the cancerous state. 

Also provided by the invention is a nucleic acid molecule encoding the 15 kDa
selenoprotein, as well as probes and primers that are useful to detect and quantify the nucleic acid
molecule. Probes and primers that are useful to detect polymorphisms in the cDNA sequence and the
gene corresponding to the 15 kDa selenoprotein are also disclosed. Probes and primers that are
useful to determine the genotype of an individual's 15 kDa selenoprotein are also disclosed. The
detection of polymorphisms in the 15 kDa selenoprotein cDNA or gene, and the determination of an
individual's genotype, may be used to determine the susceptibility of an individual to cancer,
including prostate cancer.

In other embodiments, the invention also provides compositions and methods useful to
determine the effect of chemical and biological agents (such as candidate tumor therapeutics) on the
expression of the 15 kDa selenoprotein. In one such embodiment, the effect of exposing cells to the
candidate agent is assessed by measuring the change in expression levels of the 15 kDa selenoprotein
mRNA or protein within the cell after exposure to the agent. Such methods may be used to identify
agents that have beneficial effects in the treatment or prevention of cancer, including prostate cancer.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the human cDNA sequence encoding the 15 kDa selenoprotein and the amino
acid sequence of the selenoprotein itself. In the deduced amino acid sequence, the putative signal
peptide is shown in lower case and the most probable site of post-translational cleavage is indicated
by an upward arrow. The amino acid U represents selenocysteine 93 encoded by an in-frame TGA
codon (overlined). The sequences of four tryptic peptides, for which amino acid sequences were
experimentally determined, are underlined. In the 3'-UTR, the positions of the selenocysteine
insertion sequence (SECIS element) and the poly-A addition signal (dotted underline) are shown.

FIG. 2 shows an alignment of the human 15 kDa selenoprotein sequence with homologs from
mouse, nematodes and rice. 

FIGS. 3A and 3B relate to the SECIS element. FIG. 3A shows the general features of
eukaryotic SECIS elements used to identify a matching element in the 3'-UTRs of the mRNAs
encoding human and mouse 15 kDa selenoproteins. FIG. 3B shows an alignment of the predicted
SECIS elements of the human and mouse mRNAs encoding the 15 kDa selenoprotein with a typical
experimentally verified example (human GPX-1). In helical stems, single base bulges or mismatches
are shown by gaps in the arrows. A lower case "a" residue above the human apical loop sequence
indicates a polymorphism at position 1125.

FIG. 4 is a digital image of a Western blot showing the detection of the 15 kDa
selenoprotein in cancerous and non-cancerous mouse liver tissues. 

FIG. 5 is a digital image of a Western blot showing the detection of the 15 kDa
selenoprotein in mouse cancerous and non-cancerous liver and prostate tissues.

FIG. 6 is a representative drawing showing the structure of the human 15 kDa selenoprotein
cDNA. The C/T and G/A polymorphisms at nucleotide positions 811 and 1125, respectively, are
shown.

FIG. 7 is a digital image showing the use of primer extension (A) and restriction digestion
(B and C) for the detection of polymorphisms to determine an individual's genotype.

FIG. 8A is a digital image showing the expression of recombinant forms of the 15 kDa
selenoprotein, with Coomassie Blue staining showing the overexpression of the His-tag
cysteine-for-selenocysteine mutant form of the 15 kDa selenoprotein. FIG. 8B is a digital image showing
expression of the His-tag selenocysteine-containing form of the 15 kDa selenoprotein. Lanes 1-3: 15
kDa selenoprotein cDNA; lanes 4-9: selenocysteine insertion sequence elements constructed
downstream of TGA encoding selenocysteine (see FIG. 9, B and C). Selenium-containing proteins
were detected by metabolic labeling with 75Se and visualized with a PhosphorImager.

FIG. 9 shows the bacterial selenocysteine insertion sequence elements. These structures
show the formate dehydrogenase H selenocysteine insertion sequence element (A) and two
selenocysteine insertion sequence elements (B and C) designed downstream of the TGA codon
encoding selenocysteine in the 15 kDa selenoprotein gene. The minimal essential structure necessary
for selenocysteine incorporation is boxed. The 5'-end UGA encodes selenocysteine in these three
constructs.



SEQUENCE LISTING 

The nucleic and amino acid sequences listed in the accompanying sequence listing are
shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids.
In those sequence listings showing amino acid sequences, selenocysteine is represented by Xaa.
Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood
as included by any reference to the displayed strand.

Seq. I.D. No. 1 shows the amino acid sequence of the human 15 kDa selenoprotein.

Seq. I.D. No. 2 shows the nucleic acid sequence of the human 15 kDa selenoprotein cDNA.

Seq. I.D. No. 3 shows the nucleic acid sequence of the ORF of the human 15 kDa
selenoprotein cDNA.

Seq. I.D. No. 4 shows the amino acid sequence of the putative mature form of the human 15
kDa selenoprotein after post-translational modification.

Seq. I.D. Nos. 5-7 show examples of primers that may be used to amplify portions of the
human 15 kDa selenoprotein cDNA.

Seq. I.D. No. 8 shows the nucleic acid sequence of the mouse 15 kDa selenoprotein cDNA.

Seq. I.D. No. 9 shows the amino acid sequence of the mouse 15 kDa selenoprotein.

Seq. I.D. Nos. 10 and 11 show examples of primers that may be used to amplify portions of
the mouse 15 kDa selenoprotein cDNA.

Seq. I.D. Nos. 12 and 13 show examples of primers that may be used to amplify the
polymorphism-containing region of human 15 kDa selenoprotein cDNA.

Seq. I.D. No. 14 shows a primer that can be used to determine the nucleotide at position 811
using primer extension.

Seq. I.D. No. 15 shows a primer that can be used to determine the nucleotide at position
1125 using primer extension.

DETAILED DESCRIPTION OF THE INVENTION 


I. Abbreviations and Definitions 

The following abbreviations and definitions are used herein. 

Sec - selenocysteine
IPTG - isopropyl β-D-thiogalactopyranoside
ORF - open reading frame
EST - expressed sequence tag
dbEST - database of expressed sequence tags
MALDI - matrix assisted laser desorption ionization
3'-UTR - 3' untranslated region
SECIS element - selenocysteine insertion sequence element
CGAP - Cancer Gene Anatomy Project
GPX - glutathione peroxidase
TFA - trifluoroacetic acid

cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments 
(introns). cDNA is synthesized in the laboratory by reverse transcription from messenger RNA 
extracted from cells. 

15 kDa selenoprotein: A mammalian protein of approximate molecular weight 15 kDa that
contains a selenocysteine residue encoded in the corresponding gene sequence by the codon UGA.
Levels of the 15 kDa selenoprotein are reduced in certain types of tumor cells, such as prostate
cancer cells. The present invention discloses the sequences of the human and mouse 15 kDa
selenoproteins and their corresponding cDNAs. The term "15 kDa selenoprotein" refers generically
to mammalian 15 kDa selenoproteins; the specific human or murine forms are herein referred to as
the "human 15 kDa selenoprotein" and the "murine" or "mouse 15 kDa selenoprotein." Mammalian
15 kDa selenoprotein polypeptides and cDNAs are orthologs of the disclosed murine and human 15
kDa sequences and are thus structurally related by the possession of similar amino acid and nucleic
acid structures. Typically, mammalian 15 kDa selenoprotein polypeptide sequences are
characterized by possession of at least 70% amino acid sequence identity to the human 15 kDa
selenoprotein amino acid sequence, determined using the BLAST program as described below.

Sequence identity: The relatedness of two nucleic acid sequences, or two amino acid
sequences, is typically expressed in terms of the identity between the sequences (in the case of amino
acid sequences, similarity is an alternative assessment). Sequence identity is frequently measured in
terms of percentage identity; the higher the percentage, the more similar are the two sequences.
Homologs of the human and mouse 15 kDa selenoproteins will possess a relatively high degree of
sequence identity when aligned using standard methods.
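
By way of illustration only, percentage identity over an existing pairwise alignment may be computed as in the following Python sketch. The function name and the toy peptide strings are hypothetical, and the convention used here (identical columns divided by all aligned columns, gaps included) is one common choice rather than the only one; the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Both strings must have the same length, with '-' marking a gap.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as identical only when both residues are present and equal.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 3 identical columns out of 4.
print(percent_identity("ACDE", "ACDF"))
```

A result computed in this way can then be compared against a chosen threshold, such as the 70% figure given above for mammalian 15 kDa selenoprotein polypeptides.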

Methods of alignment of sequences for comparison are well known in the art. Various
programs and alignment algorithms are described in: Smith and Waterman (Adv. Appl. Math. 2:482,
1981); Needleman and Wunsch (J. Mol. Biol. 48:443, 1970); Pearson and Lipman (Proc. Natl. Acad.
Sci. USA 85:2444, 1988); Higgins and Sharp (Gene 73:237-44, 1988); Higgins and Sharp (CABIOS
5:151-3, 1989); Corpet et al. (Nuc. Acid. Res. 16:10881-90, 1988); Huang et al. (Computer
Applications in the Biosciences 8:155-65, 1992); and Pearson et al. (Meth. Mol. Biol. 24:307-31,
1994). Altschul et al. (Nature Genet. 6:119-29, 1994) present a detailed consideration of sequence
alignment methods and homology calculations.

The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., 1990, J. Mol. Biol.
215:403-10) is available from several sources, including the National Center for Biotechnology
Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence
analysis programs blastp, blastn, blastx, tblastn and tblastx. It can be accessed at
http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to determine sequence identity using
this program is available at http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html.

Homologs of the disclosed 15 kDa selenoprotein are typically characterized by possession
of at least 70% sequence identity counted over the full length alignment with the amino acid
sequence of a selected 15 kDa selenoprotein using the NCBI Blast 2.0, Basic BLAST search, gapped
blastp program set to default parameters (BLOSUM62 matrix; Gap existence cost=11; Per residue
gap cost=1; lambda ratio=0.85). Proteins with even greater similarity to the reference sequences will
show increasing percentage identities when assessed by this method, such as at least 75%, at least
80%, at least 90% or at least 95% sequence identity. When less than the entire sequence is being
compared for sequence identity, homologs will typically possess at least 75% sequence identity over
short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least
90% or 95% depending on their similarity to the reference sequence. Methods for determining

enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the
polymerase chain reaction (PCR) or other nucleic-acid amplification methods known in the art.

Methods for preparing and using probes and primers are described, for example, in
Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York,
1989), Ausubel et al. (In: Current Protocols in Molecular Biology, Greene Publishing Associates and
Wiley-Intersciences, 1987), and Innis et al. (PCR Protocols, A Guide to Methods and Applications,
Innis et al. (eds.), Academic Press, Inc., San Diego, California, 1990). PCR primer pairs can be
derived from a known sequence, for example, by using computer programs intended for that purpose
such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge,
MA). One of skill in the art will appreciate that the specificity of a particular probe or primer
increases with its length. Thus, for example, a primer comprising 20 consecutive nucleotides of the
cDNA disclosed in Seq. I.D. No. 2 will anneal to a target sequence, such as a homologous sequence in
rat contained within a rat cDNA library, with a higher specificity than a corresponding primer of only
15 nucleotides. Thus, in order to obtain greater specificity, probes and primers may be selected that
comprise 20, 25, 30, 35, 40, 50, 75, 100 or more consecutive nucleotides of the 15 kDa selenoprotein
cDNA or gene sequences.
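
The relationship between primer length and specificity can be illustrated, in a simplified way, by estimating how often a primer of a given length would be expected to match a random target purely by chance. The Python sketch below assumes, for illustration only, that the four bases are equally frequent and independent, considers one strand and exact matches only, and uses a hypothetical target size; real hybridization specificity also depends on base composition, mismatches and reaction conditions, none of which are modeled here.

```python
def expected_chance_matches(primer_length: int, target_length: int) -> float:
    """Expected number of exact chance matches of one primer on one strand
    of a random target, assuming equal and independent base frequencies."""
    if primer_length <= 0:
        raise ValueError("primer_length must be positive")
    if target_length < primer_length:
        return 0.0
    positions = target_length - primer_length + 1  # possible start sites
    return positions / 4 ** primer_length          # each site matches with p = 4^-n


# Hypothetical target of one million nucleotides (an assumption, not from the text).
TARGET_LENGTH = 1_000_000
for length in (15, 20, 25, 30):
    print(length, expected_chance_matches(length, TARGET_LENGTH))
```

Under these assumptions the expected number of chance matches falls by a factor of four for each added nucleotide, which is in keeping with the preference stated above for longer probes and primers.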

The invention thus includes isolated nucleic acid molecules that comprise specified lengths
of the disclosed 15 kDa selenoprotein cDNA sequences. Such molecules may comprise at least 8-10,
15, 20, 25, 30, 35, 40, 50, 75, or 100 consecutive nucleotides of these sequences and may be obtained
from any region of the disclosed sequences. By way of example, the human and mouse 15 kDa
selenoprotein cDNAs shown in the Sequence Listing may be apportioned into halves or quarters
based on sequence length, and the isolated nucleic acid molecules may be derived from the first or
second halves of the molecules, or any of the four quarters. The human 15 kDa selenoprotein cDNA,
shown in Seq. I.D. No. 2, may be used to illustrate this. This cDNA is 1244 nucleotides in length and
so may be hypothetically divided into halves (nucleotides 1-622 and 623-1244) or quarters
(nucleotides 1-311, 312-622, 623-933 and 934-1244). Nucleic acid molecules may be selected that
comprise at least 8-10, 15, 20, 25, 30, 35, 40, 50, 75 or 100 consecutive nucleotides of any of these
portions of the 15 kDa selenoprotein cDNA. Thus, one such nucleic acid molecule might comprise at
least 25 consecutive nucleotides of the region comprising nucleotides 1-1244 of the disclosed
15 kDa selenoprotein cDNA.
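
The apportionment of the 1244-nucleotide cDNA into halves and quarters described above can be reproduced with the following Python sketch. The function names are hypothetical and the sketch works only with coordinates; it does not use the sequence of Seq. I.D. No. 2 itself.

```python
from typing import List, Tuple


def split_into_portions(length: int, parts: int) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges dividing `length` into `parts`."""
    base, extra = divmod(length, parts)
    ranges = []
    start = 1
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first portions
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def count_fragments(portion: Tuple[int, int], fragment_length: int) -> int:
    """Number of distinct start positions for a run of consecutive nucleotides
    of the given length lying wholly within the portion."""
    start, end = portion
    return max(0, (end - start + 1) - fragment_length + 1)


CDNA_LENGTH = 1244  # length of the human cDNA stated in the text
print(split_into_portions(CDNA_LENGTH, 2))  # [(1, 622), (623, 1244)]
print(split_into_portions(CDNA_LENGTH, 4))  # [(1, 311), (312, 622), (623, 933), (934, 1244)]
print(count_fragments((1, 311), 25))        # 287
```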

Transformed: A transformed cell is a cell into which has been introduced a nucleic acid
molecule by molecular biology techniques. As used herein, the term transformation encompasses all
techniques by which a nucleic acid molecule might be introduced into such a cell, including
transfection with viral vectors, transformation with plasmid vectors, and introduction of naked DNA
by electroporation, lipofection, and particle gun acceleration.

Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a
transformed host cell. A vector may include nucleic acid sequences that permit it to replicate in the
host cell, such as an origin of replication. A vector may also include one or more selectable marker
genes and other genetic elements known in the art.

Isolated: An "isolated" biological component (such as a nucleic acid or protein) has been
substantially separated or purified away from other biological components in the cell of the organism
in which the component naturally occurs, i.e., other chromosomal and extrachromosomal DNA and
RNA and proteins. Nucleic acids and proteins which have been "isolated" thus include nucleic acids
and proteins purified by standard purification methods. The term also embraces nucleic acids and
proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic
acids.

Purified: The term purified does not require absolute purity; rather, it is intended as a
relative term. Thus, for example, a purified 15 kDa selenoprotein preparation is one in which the 15
kDa selenoprotein is more enriched than the protein is in its natural environment within a cell.
Preferably, a preparation of 15 kDa selenoprotein is purified such that the 15 kDa selenoprotein
represents at least 50% of the total protein content of the preparation.

Oligonucleotide: A linear polynucleotide sequence of up to about 100 nucleotide bases in
length.

ORF (open reading frame): A series of nucleotide triplets (codons) coding for amino 
acids without any termination codons. These sequences are usually translatable into a peptide. 
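
By way of illustration, the reading of an ORF in which an in-frame TGA codon encodes selenocysteine rather than terminating translation (as described herein for the 15 kDa selenoprotein) may be sketched in Python as follows. The function name, the toy sequence, and the convention of passing the positions of selenocysteine codons explicitly are assumptions made for this example; the standard genetic code table is built from the conventional TCAG ordering.

```python
from itertools import product

# Standard genetic code in TCAG order; '*' marks a termination codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_orf(dna: str, sec_codon_indices=frozenset()) -> str:
    """Translate codon by codon from the first base until a termination codon.

    TGA codons whose 0-based codon index is in `sec_codon_indices` are read
    as selenocysteine ('U'); every other TGA, TAA or TAG ends translation.
    """
    dna = dna.upper()
    protein = []
    for index in range(len(dna) // 3):
        codon = dna[3 * index:3 * index + 3]
        if codon == "TGA" and index in sec_codon_indices:
            protein.append("U")
            continue
        amino_acid = CODON_TABLE[codon]  # raises KeyError for non-ACGT input
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence, not taken from the Sequence Listing.
toy = "ATGTGAGGCTAA"
print(translate_orf(toy))        # TGA read as a stop
print(translate_orf(toy, {1}))   # TGA at codon index 1 read as selenocysteine
```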

Operably linked: A first nucleic acid sequence is operably linked with a second nucleic
acid sequence when the first nucleic acid sequence is placed in a functional relationship with the
second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the
promoter affects the transcription or expression of the coding sequence. Generally, operably linked
DNA sequences are contiguous and, where necessary to join two protein coding regions, in the same
reading frame.

Pharmaceutically acceptable carriers: The pharmaceutically acceptable carriers
useful in this invention are conventional. Remington's Pharmaceutical Sciences, by E. W. Martin,
Mack Publishing Co., Easton, PA, 15th Edition (1975), describes compositions and formulations
suitable for pharmaceutical delivery of the fusion proteins herein disclosed.

In general, the nature of the carrier will depend on the particular mode of administration
being employed. For instance, parenteral formulations usually comprise injectable fluids that include
pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced
salt solutions, aqueous dextrose, glycerol or the like as a vehicle. For solid compositions (e.g.,
powder, pill, tablet, or capsule forms), conventional non-toxic solid carriers can include, for example,
pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate. In addition to
biologically-neutral carriers, pharmaceutical compositions to be administered can contain minor
amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and
pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate.

1987) and Harlow and Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory,
New York, 1988).

DNA sequencing: Plasmids were isolated according to the instructions provided with the
plasmid purification kit (Qiagen), the sequencing reaction products purified on separation columns as
described by the manufacturer (Princeton Separations) and the nucleotide sequences of EST clones
determined using a Dye Terminator Cycle Sequencing kit as described by the manufacturer (Perkin
Elmer).

Computer analyses: Three different peptide sequences from the human 15 kDa
selenoprotein were analyzed for matches to the dbEST database of partial cDNA sequences (Boguski
et al., 1993, Nature Genet. 4:332-3) using the BLAST (Altschul et al., 1990, J. Mol. Biol. 215:403-10
and Altschul et al., 1994, Nature Genet. 6:119-29) and gapped BLAST-2 (Altschul & Gish, 1996,
Methods Enzymol. 266:460-80) programs. Multiple alignments of expressed sequence tag (EST)
sequences and their translated products were viewed using the MSPcrunch/Blixem system
(Sonnhammer & Durbin, 1994, Comput. Appl. Biosci. 10:301-7). The Blixem alignments also
revealed polymorphic sites in the human ESTs that were clearly distinct from sequencing errors.

Generation of Polyclonal Antibodies: Polyclonal antibodies which recognize the 15 kDa
selenoprotein were made using standard procedures (for example Harlow and Lane, Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory, 1988, Chapter 5). A synthetic peptide fragment
containing the C-terminal region of Seq. I.D. No. 1 (amino acids 145-162) was conjugated to the
carrier KLH (keyhole limpet hemocyanin) and injected into rabbits. Specificity of the polyclonal
antisera was determined using Western blotting of the purified recombinant human 15 kDa
selenoprotein.

III. Purification and Characterization of the Human 15 kDa Selenoprotein

The human 15 kDa selenoprotein was detected in and purified from the human Jurkat T-cell
line, JPX9 (Nagata et al., 1989, J. Virol. 63:3220-6) by growing the cells in the presence of 75Se
followed by analysis of extracts of the 75Se-labeled cells by SDS PAGE and PhosphorImager
detection of radioactivity on the gels. One of the major 75Se-labeled proteins that migrated as a 15
kDa band on SDS PAGE was purified initially on DEAE-Sepharose and phenyl-Sepharose columns,
and then further on a reverse-phase column. The procedures used were as follows. JPX9 cells were grown
and labeled with [75Se]selenious acid (2 μCi/ml) as described in Gladyshev et al. (Proc. Natl. Acad.
Sci. USA 93:6146-51, 1996). 75Se-labeled JPX9 cells were mixed with unlabeled cells, suspended in
2 volumes of 30 mM Tris-HCl, pH 7.5, 1 mM EDTA, 2 mM DTT, 1 mM MgCl2, 1 mM
phenylmethylsulfonyl fluoride and disrupted by sonication. Disrupted cells were centrifuged and the
supernatant was applied to a DEAE-Sepharose column that had been equilibrated with 30 mM
Tris-HCl, pH 7.5, 2 mM DTT and 1 mM EDTA (buffer A). The column was washed with 2 volumes of buffer
A and proteins were eluted by application of a linear gradient from 0 to 500 mM NaCl in buffer A.
Fractions containing 75Se were analyzed on SDS gels. Fractions containing the human 15 kDa
selenoprotein that eluted from the DEAE column with 350 mM NaCl were combined, concentrated,
adjusted to a concentration of 0.5 M ammonium sulfate in buffer A, and applied to a phenyl-Sepharose
column equilibrated in 1 M ammonium sulfate in buffer A. The column was washed by application of a
linear gradient from 0.5 to 0 M ammonium sulfate in buffer A, and radioactive fractions
corresponding to the 15 kDa selenoprotein were eluted by application of a linear gradient from buffer A to
water. Radioactive fractions were combined, concentrated, and loaded on a C18 reverse-phase HPLC
column that had been equilibrated in 0.05% trifluoroacetic acid (TFA). A gradient of 0 to 60%
acetonitrile in 0.05% TFA was applied, and 75Se-containing fractions corresponding to the 15 kDa
selenoprotein eluted at 48% acetonitrile.

Fractions containing the human 15 kDa selenoprotein from the C18 column were dried on a
Speed-Vac SC110 (Savant), dissolved in SDS-PAGE sample buffer and analyzed by SDS-PAGE.
The molecular mass of the human 15 kDa selenoprotein was determined by electrospray and MALDI
mass-spectrometry in fractions from the C18 column. Both mass spectra revealed a single strong
signal of the 15 kDa selenoprotein. The native molecular mass of the 15 kDa selenoprotein purified
on a DEAE-Sepharose column was determined using native PAGE and analytical HPLC gel filtration
as described by Gladyshev et al. (Biochemistry 35:213-23, 1996). The 15 kDa selenoprotein was
detected as 75Se-labeled fractions from a gel-filtration column and as a 75Se-labeled band on native
PAGE.

The molecular mass of the human 15 kDa selenoprotein subunit in fractions from the C18
column determined by MALDI mass-spectrometry was 14,830 Da. Electrospray mass-spectrometry 
of the same preparation revealed a molecular mass of 14,870 Da. The N-terminus of the protein was 
blocked which prevented determination of the N-terminal sequence. 

Amino acid analysis of the purified protein (performed by Harvard Microchem, Boston,
MA), shown in Table 1, reveals a lack of internal methionine and histidine residues, as well as the
hydrophobic character of the protein.

translation. Although selenocysteine was not directly identified as a component of the 15 kDa
selenoprotein, the labeling of the protein with 75Se, readthrough of the TGA codon and the location
of the selenocysteine insertion sequence (SECIS) element in the untranslated area (below) suggest the
presence of selenocysteine in the protein. The predicted ORF encoded a protein of 17,790.6 Da. The
mass of the purified 15 kDa selenoprotein was 14,870 Da, and this discrepancy suggested
post-translational processing of the protein. Processing of the 15 kDa selenoprotein appears to occur at
the N-terminal portion of the protein, since antiserum raised to a synthetic peptide that was identical
in sequence to the eighteen C-terminal residues of the 15 kDa selenoprotein recognized the 15 kDa
selenoprotein at different stages of purification. In addition, one of the sequenced tryptic peptides
obtained from digests of the 15 kDa selenoprotein corresponded to residues 146-158, located near the
C-terminus according to the predicted gene sequence.

The N-terminal portion of the putative precursor of the 15 kDa selenoprotein, as predicted
from the gene sequence, has a stretch of hydrophobic amino acid residues, suggesting the presence of
a signal peptide. Cleavage of these N-terminal amino acid residues is consistent with the amino acid
composition of the protein (Table 1), since the processed protein matches more closely the amino
acid analysis data obtained for the purified 15 kDa selenoprotein than the full size 17 kDa protein.
One possible site for post-translational processing is Ser27, which coincides with the site of an
exon-intron junction (not shown), making this residue the evolutionarily favorable site for post-translational
processing.
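
The size of the discrepancy between the predicted and measured masses can be put in rough terms with the following Python sketch. The three mass values are those reported herein; the average residue mass of about 110 Da is a commonly used approximation introduced here as an assumption, and the function name is hypothetical. The estimate ignores the actual composition of the removed segment and any modification of the blocked N-terminus.

```python
PREDICTED_ORF_MASS_DA = 17790.6   # predicted from the ORF (from the text)
ELECTROSPRAY_MASS_DA = 14870.0    # measured by electrospray (from the text)
MALDI_MASS_DA = 14830.0           # measured by MALDI (from the text)
AVERAGE_RESIDUE_MASS_DA = 110.0   # assumption: typical average amino acid residue mass


def approximate_residues_removed(precursor_mass: float, mature_mass: float,
                                 residue_mass: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough number of residues accounting for the mass difference."""
    return (precursor_mass - mature_mass) / residue_mass


for label, measured in (("electrospray", ELECTROSPRAY_MASS_DA), ("MALDI", MALDI_MASS_DA)):
    difference = PREDICTED_ORF_MASS_DA - measured
    residues = approximate_residues_removed(PREDICTED_ORF_MASS_DA, measured)
    print(f"{label}: difference {difference:.1f} Da, about {residues:.1f} residues")
```

Under this approximation the difference corresponds to a segment on the order of 27 residues, which is of the same order as the proposed processing site at Ser27.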

V. Homologous mouse, rat, Brugia malayi, Caenorhabditis elegans and rice gene sequences

Computer sequence analyses of the 15 kDa selenoprotein and its gene sequence revealed no
homology to known proteins. However, a number of dbEST sequences from mouse, rat, B. malayi,
C. elegans and rice showed strong homology in TBLASTN searches with the 15 kDa human protein
(FIG. 2).

The amino acid sequence of the mouse protein was deduced from the assembly of 39
independent partial cDNA sequences in dbEST. In addition, experimental confirmation of the 5'
region encoding the mouse N-terminal sequence was made from partial cDNAs obtained from the
IMAGE consortium. The C. elegans sequence was assembled from two partial cDNA clones
(GenBank dbEST accession numbers C10051 and C08344) which are identical for an 81 bp region
of overlap and encode the apparently complete reading frame shown. The partial amino acid
sequence of the homolog from the filarial nematode, B. malayi, was translated from a single partial
cDNA (GenBank dbEST accession number AA257328). Two rice partial cDNAs (GenBank dbEST
accession numbers D47693, D47819) covered the translated region shown (in addition, shorter
segments of similarity to the human sequence were noted in translations further downstream, but
these were in error-prone regions of mismatch between the two ESTs and are not shown). All
pairwise alignments were strongly significant, as shown by TBLASTX-2 (Washington University
gapped blast, February 1997 release obtained from http://blast.wustl.edu/blast/executables). Typical
EST pairs gave amino acid gapped E (expect) values (BLOSUM 62 matrix), using the sum statistics
of Altschul and Gish (Methods Enzymol. 266:460-80, 1996) as follows (with the highest HSP score
appended in square brackets): human/mouse: 2 x 10^-35 [717]; human/C. elegans: 2 x 10^-20 [252];
human/B. malayi: 8 x 10^-12 [228]; C. elegans/B. malayi: 8 x 10^-21 [257]; human/rice (including
multiple short matches for scoring purposes): 1 x 10^-2 [82].

Interestingly, although mouse and rat genes encode potential selenocysteine-containing 15 
kDa proteins, the genes in C. elegans and B. malayi encode homologous proteins containing cysteine 
in place of selenocysteine. This is consistent with observations that nematode genes for glutathione 
peroxidase and thioredoxin reductase encode cysteine analogs of mammalian selenocysteine- 
containing proteins. The complete mouse 15 kDa selenoprotein cDNA and amino acid sequences are 
presented in Seq. I.D. Nos. 8 and 9, respectively. 

The regions flanking Sec93 in the human 15 kDa selenoprotein had the highest degree of 
homology among proteins from different organisms, suggesting that the selenocysteine residue is 
located in a putative active center. In other mammalian selenocysteine-containing proteins, in which 
the function is established, the selenocysteine residue is located at the active center and it is essential 
for catalytic activity of the selenoenzyme (Stadtman, 1996, Annu. Rev. Biochem. 65:83). 

VI. Tissue Distribution of the Human 15 kDa Selenoprotein 

Approximately 120 partial human cDNA sequences in dbEST were found to match the
human 15 kDa selenoprotein DNA sequence (within experimental error or expected frequencies of
natural polymorphism). This sampling represents a sufficient abundance of independent clones to
reveal the approximate tissue distribution of expression of this relatively highly-expressed gene
(expression as mRNA). cDNA libraries from 32 different adult, fetal or embryonic tissues or organs
were represented in this set of sequences. Table 2 shows the ranked incidence of these clones in
tissues and organs for which at least one library has two or more independent 15 kDa selenoprotein
cDNAs in dbEST.

Clearly, the 15 kDa selenoprotein gene exhibits a very broad spectrum of moderate
expression in many tissues, and significantly higher levels of mRNA are shown by thyroid,
parathyroid tumor, prostate and pre-cancerous prostate cells. Expression estimates from dbEST
library frequencies should be considered to be only semi-quantitative, considering that some libraries
are normalized and variable levels of tissue contamination may exist. More quantitative
representative estimates are given by the stringent CGAP (Cancer Gene Anatomy Project) libraries
(Strausberg et al., 1997, Nature Genet. 15:415-6) prepared from small numbers of
laser-microdissected cells, for example the pre-cancerous prostate library CGAP_Pr2 (Krizman et al.,
1996, Cancer Res. 56:5380-3; Table 2). Irrespective of the quantitative uncertainties, this large body
of partial cDNA sequence data strongly demonstrates that the 15 kDa protein gene is expressed in a
wide range of human tissues, with increased levels of mRNA in the thyroid, parathyroid and
prostate-derived cells. The expression of the mouse analog of the human 15 kDa selenoprotein was examined
by immunoblot assays in prostate, heart, kidney, spleen, liver and other mouse organs, with the
highest level observed in prostate, suggesting the expression of both mRNA and the selenoprotein in
many tissues and cell lines.

Table 2. Incidence of the human 15 kDa selenoprotein gene expression

Library                                     Incidence per 10,000 ESTs (numbers/library size)
Thyroid                                     19.9 (4/2014)
Parathyroid tumor (Soares NbHPA)            18.4 (12/6511)
Prostate pre-cancerous cells (CGAP_Pr2)     15.4 (3/1945)
Prostate                                    11.2 (2/1792)
Fetal lung (Soares NbHL19W)                 9.8 (9/9145)
Colon carcinoma (3 libraries)               5.6 (2/3358, 1/2791, 1/956)
Aorta                                       4.4 (2/4595)
Fetal retina (Stratagene 937202)            4.3 (2/4610)
Jurkat T-cells (2 libraries)                4.3 (2/3534, 1/3420)
Retina (2 libraries)                        4.1 (3/8915, 2/3368)
Neuroepithelium (Stratagene 937231)         3.7 (2/5385)
Colon (Stratagene 937204)                   3.3 (3/8974)
Testis (Soares NHT)                         2.9 (4/13657)
Fetal heart (Soares NbHH19W)                2.3 (6/25708)
Germinal B-cells (CGAP_GCB1)                1.0 (2/19194)

17 libraries from other tissues, including 3 distinct embryo libraries, contained only a single 15 kDa 
protein cDNA clone and are not tabulated here. For some clones, both 5' and 3' EST sequences are 
present in dbEST: these count as only a single cDNA in these calculations. 
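
The incidence figures in Table 2 can be reproduced with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and pooling the clone counts and library sizes of multi-library entries is an assumption (it is not stated in the text, although it reproduces the tabulated values).

```python
def incidence_per_10k(libraries):
    """Return cDNA clone incidence per 10,000 ESTs.

    `libraries` is a list of (clone_count, library_size) pairs for one
    tissue. Multi-library entries are pooled before dividing (assumption).
    """
    clones = sum(count for count, _ in libraries)
    total_ests = sum(size for _, size in libraries)
    return round(10_000 * clones / total_ests, 1)


# A few rows of Table 2, entered as (clones, library size) pairs.
table2_rows = {
    "Thyroid": [(4, 2014)],
    "Colon carcinoma (3 libraries)": [(2, 3358), (1, 2791), (1, 956)],
    "Retina (2 libraries)": [(3, 8915), (2, 3368)],
}

for library, counts in table2_rows.items():
    print(f"{library}: {incidence_per_10k(counts)}")
```

Because each clone is counted once even when both 5' and 3' ESTs are present, the clone counts passed in should already be de-duplicated.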

VII. Selenocysteine insertion element sequence 

Studies of the mechanism of selenocysteine incorporation into several eukaryotic 
selenoproteins have implicated related stem-loop structures, located in the mRNA 3'UTR, as essential 
for selenocysteine insertion into proteins at a UGA codon in the coding sequence. The general 
structural features of this SECIS (selenocysteine insertion sequence) element have been deduced 
previously (Low and Berry, 1996, Trends Biochem. Sci. 21:203, and Walczak et al., 1996, RNA
2:367), based on chemical probe experiments and sequence alignments, as summarized in FIG. 3.

To locate potential SECIS elements in the 15 kDa selenoprotein mRNAs, the human and 
mouse cDNAs were searched for sequences meeting the following constraints (see FIG. 3A): Helix 
I: at least 4 base pairs; Internal loops: 3-9 nucleotides; Quartet (the non-Watson-Crick base paired
motif): UGAN (following A in Internal loop) NGAN (following the downstream strand of Helix II); 
Helix II: 9-15 standard base pairs extending the Quartet; Apical loop: 10-20 nucleotides starting 
with AA(A/G). Single base mismatches or bulges were allowed within helices longer than 6 base 
pairs. 
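
The search constraints listed above can be expressed as a simple filter. The following is a minimal sketch, not the search program actually used: it assumes a candidate stem-loop has already been decomposed into its parts (the `SecisCandidate` fields are hypothetical names), it reads "following A in Internal loop" as the 5' internal loop ending in A, and it does not model the allowance for single mismatches or bulges in long helices.

```python
import re
from dataclasses import dataclass


@dataclass
class SecisCandidate:
    """A pre-folded candidate stem-loop (hypothetical representation)."""
    helix1_bp: int          # base pairs in Helix I
    internal_loop_5p: str   # 5' strand of the internal loop (RNA letters)
    internal_loop_3p: str   # 3' strand of the internal loop
    quartet_5p: str         # four nucleotides on the 5' strand
    quartet_3p: str         # four nucleotides on the 3' strand
    helix2_bp: int          # standard base pairs in Helix II
    apical_loop: str        # apical loop sequence, 5' to 3'


def meets_secis_constraints(c: SecisCandidate) -> bool:
    """Check a candidate against the constraints stated in the text."""
    return (
        c.helix1_bp >= 4
        and 3 <= len(c.internal_loop_5p) <= 9
        and 3 <= len(c.internal_loop_3p) <= 9
        and c.internal_loop_5p.endswith("A")          # A precedes UGAN
        and re.fullmatch(r"UGA[ACGU]", c.quartet_5p) is not None
        and re.fullmatch(r"[ACGU]GA[ACGU]", c.quartet_3p) is not None
        and 9 <= c.helix2_bp <= 15
        and 10 <= len(c.apical_loop) <= 20
        and re.match(r"AA[AG]", c.apical_loop) is not None
    )


# Invented example values, not taken from FIG. 3.
example = SecisCandidate(
    helix1_bp=5,
    internal_loop_5p="CUUA",
    internal_loop_3p="AGU",
    quartet_5p="UGAC",
    quartet_3p="UGAU",
    helix2_bp=11,
    apical_loop="AAAGCUCUGUCC",
)
print(meets_secis_constraints(example))
```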

Sequences meeting these stringent criteria were found in both the human and mouse
3'-UTRs, ending approximately 60 nucleotides upstream of the poly-A addition signal sequence (FIG.
1). FIG. 3B shows these human and mouse sequences aligned with the canonical SECIS element
(Low and Berry, 1996, Trends Biochem. Sci. 21:203, and Walczak et al., 1996, RNA 2:367) of the
human glutathione peroxidase 1 (GPX-1) mRNA 3'-UTR. The 15 kDa protein mRNAs exhibit all
the features known to be necessary in other eukaryotic selenoprotein mRNAs to promote
selenocysteine insertion.

VIII. Chromosomal localization of the gene for the 15 kDa Selenoprotein 

Computer analyses revealed that the UNIGENE cluster of ESTs (Boguski and Schuler, 1995,
Nature Genetics 10:369-371) corresponding to the 15 kDa human selenoprotein maps to human
chromosome 1, at the position 117-123 cM on the human transcript gene map, corresponding
approximately to 1p31 (Schuler et al., 1996, Science 274:540-6).

IX. Differential Expression of the 15 kDa Selenoprotein Polypeptide and mRNA in 
Cancers 

The expression of the 15 kDa selenoprotein and its mRNA is altered in several mouse and 
human cancers compared to non-cancerous tissues. Variations in the levels of both the polypeptide 
and the mRNA can be detected using standard procedures such as Western blotting (for polypeptide) 
and Northern blotting (mRNA). 

For example, the expression of the 15 kDa selenoprotein was compared in cancerous and
non-cancerous mouse liver tissues by Western blotting using the polyclonal antibody described
above. As shown in FIG. 4, equal amounts of protein were loaded on each lane in the following
order: lanes 1 and 2 - wild type, 2.5 months; lanes 3 and 4 - c-myc, 2.5 months; lanes 5 and 6 -
c-myc/TGFα, 2.5 months; lanes 7 and 8 - c-myc/TGFα, 10 months; lanes 9-11 - c-myc/TGFα, tumor, 10
months; lanes 12 and 13 - wild type, 1 month; lanes 14 and 15 - c-myc, 1 month; lanes 16 and 17 -
c-myc/TGFα, 1 month; lanes 18 and 19 - c-myc/TGFα, 10 months; lanes 20-22 - c-myc/TGFα, tumor,
10 months. Each sample is from a different mouse. c-myc/TGFα represents a double transgenic
mouse. The c-myc and c-myc/TGFα mice are models for accelerated hepatocarcinogenesis.

The levels of the 15 kDa selenoprotein polypeptide were observed to be 3-5 fold lower in
tumor than in surrounding tissue in livers of c-myc/TGFα transgenic mice (FIG. 4). These mice are
characterized by elevated production of reactive oxygen species, increased lipid peroxidation and
significant chromosome abnormalities. Oxidative stress in c-myc/TGFα mice can be reduced by
supplementation of the diet with vitamin E (V. Factor, personal communication), suggesting that
selenium may have a similar protective effect. On the other hand, expression of the 15 kDa protein
was not altered in hepatocarcinomas of c-myc and c-myc/TGFβ transgenic mice, for which no
oxidative stress has been reported.

Additionally, Western blotting revealed decreased expression of the 15 kDa
selenoprotein in prostate cancer cell lines relative to the normal prostate (see FIG. 5). Equal protein
amounts were loaded on each lane as follows: lane 1 - c-myc/TGFα liver, 10 months (matched to the
sample in lane 2); lane 2 - c-myc/TGFα liver, tumor 10 months; lane 3 - mouse prostate; lane 4 -
purified human T-cell 15 kDa protein control 1; lane 5 - mouse prostate cancer cell line 1; lane 6 -
mouse prostate cancer cell line 2; lane 7 - mouse prostate; lane 8 - c-myc/TGFα liver, 10 months
(matched to the sample in lane 9); lane 9 - c-myc/TGFα liver, 10 months; lane 10 - purified human
T-cell 15 kDa protein control 2.

Northern blotting of matched samples revealed decreased expression of the human 15 kDa
selenoprotein mRNA in lymphoma and in ovarian and fallopian tube cancers relative to the
corresponding normal lymph node, ovary and fallopian tube (data not shown).

X. Tumor-Related Variants in the 15 kDa Selenoprotein SECIS Element 


Human EST alignments indicated that a G/A substitution polymorphism or mutation (FIG.
6) occurred at an apical loop nucleotide of the SECIS element in the 3'-UTR region of the human
15 kDa selenoprotein cDNA sequence (nucleotide position 1125). An additional substitution (C/T)
polymorphism was observed at position 811 (FIG. 6). Sequence analysis of the region containing the
polymorphisms for over 100 individuals revealed that the substitution polymorphisms at these two
variant sites, positions 811 and 1125, were linked to each other with a very high probability. Only
two variations of the polymorphisms were detected. Individuals with a C at position 811 always had
a G at position 1125 (form 1: C811....G1125, referred to herein as CG), while individuals with a T at
position 811 always had an A at position 1125 (form 2: T811....A1125, referred to herein as TA).

Given the critical role that the SECIS element has in incorporation of selenocysteine into
proteins, changes in nucleotide 1125 located in the SECIS element may affect the efficiency of
selenocysteine incorporation in the coding region of the gene, thereby providing a mechanism for
controlling the expression of the 15 kDa selenoprotein in tumor and normal tissues.

The genotype of the 15 kDa selenoprotein was determined for several individuals. Normal
and cancerous tissues were collected, as well as blood samples to determine if the genotype of the
tumor was different from that of non-tumor lymphocytes within the same individual. DNA from the
blood and tissue samples was isolated using the protocols and procedures included in the Puregene
DNA Isolation Kit (Gentra). The isolated DNA (0.1-1.0 µg) was used as template for Polymerase
Chain Reaction (PCR) amplification using the GeneAmp PCR Amplification Kit and the following
primers: forward primer 5'-CAGACTTGCGGTTAATTATG-3' (Seq. I.D. No. 12) and the reverse
primer 5'-GCCAAGTATGTATCTGATCC-3' (Seq. I.D. No. 13). The PCR reactions included 0.2
mM dNTPs, 1.5 mM MgCl2, 0.4 mM each primer and 1.25 units of Taq polymerase and were
incubated for 35 cycles of 85°C for 30 seconds, 45°C for 60 seconds and 72°C for 90 seconds.

Successful amplification was indicated by the appearance of a DNA band of approximately 400 bp 
on a 1% agarose gel. The resulting PCR product was subjected to primer extension or restriction 
digestion, to determine the genotype of the individual. 
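
As a quick sanity check on the two primers given above, their length, GC content and reverse complement can be computed directly. This is a sketch for illustration: the helper names are hypothetical, and the melting-temperature estimate uses the Wallace rule of thumb (2°C per A/T, 4°C per G/C), which is an assumption introduced here and is not part of the described protocol.

```python
FORWARD_PRIMER = "CAGACTTGCGGTTAATTATG"  # Seq. I.D. No. 12
REVERSE_PRIMER = "GCCAAGTATGTATCTGATCC"  # Seq. I.D. No. 13

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule, an approximation)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    print(name, len(primer), gc_percent(primer), wallace_tm(primer))

# The reverse primer anneals to the strand carrying this sequence.
print(reverse_complement(REVERSE_PRIMER))
```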
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are from individuals heterozygous (C/T) at position 811. The same analysis is used for FIG. 7C, 
which shows DNA digested with BfaI to identify the nucleotide at position 1125. Only DNA
containing a G at position 1125 will be digested. Lanes containing only the upper band are
homozygous A (neither strand of DNA cut), lanes containing only the lower band are homozygous G
(both strands of DNA cut), while lanes containing both bands are heterozygous G/A at position 1125.

The experiments described above verified both the existence of polymorphisms at
nucleotide positions 811 and 1125 within the 15 kDa selenoprotein gene, and the ability to determine
an individual's genotype with respect to the 15 kDa selenoprotein gene. Using these methods, the
correlation between the polymorphisms at positions 811 and 1125 in the 15 kDa selenoprotein gene
and incidence of cancer, as well as race, was determined. The genetic distribution of alleles was
analyzed in more than 200 human normal and tumor samples (Table 3). DNA from normal tissue,
head and neck tumors, and colon tumors was isolated and amplified using PCR as described above
with primers shown in Seq. I.D. Nos. 12 and 13. The PCR product was restriction digested with
DraI, to determine the nucleotide identity at position 811, or BfaI, to determine the nucleotide identity
at position 1125.
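
The band-pattern interpretation described above for the position 1125 digest, together with the observed linkage between positions 811 and 1125, can be sketched as a small lookup. The function names are hypothetical, and the code only encodes the rules stated in the text (digestion occurs only when G is present at position 1125; C811 pairs with G1125 and T811 pairs with A1125).

```python
def call_position_1125(upper_band: bool, lower_band: bool) -> str:
    """Infer the position 1125 genotype from the digest band pattern.

    Upper band = undigested product (A allele); lower band = digested
    product (G allele), following the description in the text.
    """
    if upper_band and lower_band:
        return "G/A"
    if upper_band:
        return "A/A"
    if lower_band:
        return "G/G"
    raise ValueError("no band detected; amplification or digestion may have failed")


# Linked forms reported in the text: C811 with G1125 (CG), T811 with A1125 (TA).
_LINKED = {
    ("C/C", "G/G"): "CG/CG",
    ("C/T", "G/A"): "CG/TA",
    ("T/T", "A/A"): "TA/TA",
}


def combined_genotype(pos811: str, pos1125: str) -> str:
    """Combine the two single-site genotypes into the CG/TA notation."""
    try:
        return _LINKED[(pos811, pos1125)]
    except KeyError:
        raise ValueError(
            f"{pos811} at 811 with {pos1125} at 1125 does not fit the linked forms"
        ) from None


print(combined_genotype("C/T", call_position_1125(True, True)))
```

A combination that raises `ValueError` would indicate either a genotyping error or a haplotype outside the two linked forms and would merit re-testing.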

The differences in genotype between control and cancer patients were examined, as well as
the differences between Caucasians and African Americans (blacks or persons of African ancestry).
CG/CG and TA/TA patients are homozygous at positions 811 and 1125 and CG/TA patients are
heterozygous at positions 811 and 1125. As shown in Table 3, the substitution
polymorphisms, T substituted for C at position 811 and A substituted for G at position 1125, were
found more often in cancer samples, and this form is designated as a "cancer" polymorphism. The cancer
polymorphism therefore includes both the CG/TA and TA/TA alleles in Table 3. The tendency of
the cancer polymorphism to be present in individuals having cancer was observed for the Caucasian
population, and this observation was statistically significant for the African American population.
Table 3 also demonstrates that the cancer polymorphism is more prevalent in the African American
population. In addition, an example of loss of heterozygosity has been detected in a sample of
African American origin. The African American population is known to be at higher risk of prostate
cancer and dietary selenium (which may increase expression of the 15 kDa selenoprotein) has the
single most pronounced effect in preventing this particular type of cancer. The high expression of
the 15 kDa protein in prostate tissue correlates with both the chemopreventive effect of selenium in
the prostate, and the increased risk of prostate cancer in the African American population. Therefore,
determination of an individual's genotype may be used as an indicator of the need for dietary
selenium supplementation to inhibit tumor development.

These data suggest that patients carrying the allele with the cancer polymorphism are
more likely to develop cancer. Therefore, this cancer polymorphism may be used as a
cancer-predicting tool for populations at risk for developing certain cancers.
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Table 3. Genotype analysis of the 15 kDa selenoprotein polymorphisms 




Caucasians                             CG/CG        CG/TA        TA/TA

Normal                                 19 (58%)     13 (39%)     1 (3%)
Head and Neck Cancer                   34 (57%)     21 (35%)     5 (8%)
Colon Cancer                           11 (50%)     9 (41%)      2 (9%)
Colon cancer patients' lymphocytes     9 (53%)      6 (35%)      2 (12%)

African Americans

Normal                                 11 (17%)     37 (59%)     15 (24%)
Head and Neck Cancer                   7 (24%)      11 (38%)     11 (38%)
Colon Cancer
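
For readers who wish to re-derive summary figures from the genotype counts in Table 3, the following sketch computes, for each group, the percentage of individuals carrying at least one TA form and the TA allele frequency. The function and variable names are hypothetical, only rows with counts available in this section are included, and no statistical test is implied.

```python
# Genotype counts as (CG/CG, CG/TA, TA/TA), read from Table 3.
TABLE3_COUNTS = {
    ("Caucasian", "Normal"): (19, 13, 1),
    ("Caucasian", "Head and neck cancer"): (34, 21, 5),
    ("Caucasian", "Colon cancer"): (11, 9, 2),
    ("Caucasian", "Colon cancer patients' lymphocytes"): (9, 6, 2),
    ("African American", "Normal"): (11, 37, 15),
    ("African American", "Head and neck cancer"): (7, 11, 11),
}


def summarize(cg_cg: int, cg_ta: int, ta_ta: int) -> tuple[float, float]:
    """Return (% carriers of the TA form, % TA allele frequency)."""
    n = cg_cg + cg_ta + ta_ta
    carriers = 100 * (cg_ta + ta_ta) / n
    ta_allele = 100 * (cg_ta + 2 * ta_ta) / (2 * n)
    return round(carriers, 1), round(ta_allele, 1)


for (population, group), counts in TABLE3_COUNTS.items():
    carriers, ta_allele = summarize(*counts)
    print(f"{population:17s} {group:36s} carriers {carriers}%  TA allele {ta_allele}%")
```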
XI. Expression of Recombinant 15 kDa Selenoprotein in E. coli

The human 15 kDa selenoprotein was expressed in BL21(DE3) E. coli in the form of its
cysteine-for-selenocysteine mutant (T for A substitution at nucleotide position 283), with (FIG. 8A)
and without a His-tag using the pET-21b(+) vector (Novagen). Bacteria were grown in LB media
with 100 mg/liter ampicillin at 37°C to an OD600 of 0.5, then induced with 1 mM IPTG. Growth was
continued for 3 hours at 37°C after induction. As shown in FIG. 8A (arrow), high levels of the
cysteine mutant are expressed 3 hours after IPTG induction.

In addition, the human 15 kDa selenoprotein cDNA was genetically engineered to contain a
bacterial selenocysteine insertion sequence element (stem-loop structure downstream of the
selenocysteine TGA codon), so that selenocysteine would be incorporated into the human 15 kDa
selenoprotein during its expression in bacteria (FIGS. 8B and 9). The nucleotide sequence
downstream of TGA (encoding selenocysteine) was mutated in such a way that an mRNA structure
would be formed that resembles the mRNA structure in the E. coli formate dehydrogenase H that is
necessary for selenocysteine incorporation (FIG. 9A). Two different constructs were generated
(FIGS. 9B and C), containing mutations in the area downstream of TGA. These mutants had a
protein sequence that differed in either 3 or 4 amino acid residues from the wild type human 15
kDa selenoprotein sequence.

⁷⁵Se-labeling experiments (1 nmol/ml radioactive Na₂⁷⁵SeO₃ (~8 Ci/nmol) was added at the
time of IPTG induction, as described above) demonstrated that the designed mRNA structure
resulted in selenocysteine incorporation into protein (FIG. 8B). Thus, the recombinant 15 kDa
selenoproteins will be available for functional studies as described in the examples below (for
example, generating antibodies as in Example 4). This is the first time any mammalian selenoprotein
was expressed in bacteria in a form that contains a selenocysteine residue.

EXAMPLES 

The following examples are illustrative of the scope of the present invention. 
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EXAMPLE 1 
Obtaining 15 kDa Selenoprotein cDNA 
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of example, it is known that mammalian selenium-containing proteins are very difficult to express in 
bacteria, yeast or insect cells. Accordingly, in order to facilitate expression of the protein in these 
cells, a sequence variant may be produced in which the TGA codon (encoding selenocysteine) is
replaced with a codon encoding cysteine (either TGC or TGT). However, as described above, it is
possible to generate the 15 kDa human selenoprotein without such a mutation.
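
The codon replacement described above can be illustrated in a few lines. This is a sketch using an invented toy open reading frame, not the actual 15 kDa selenoprotein sequence; the function name is hypothetical. It replaces an in-frame TGA with one of the two standard cysteine codons (TGT or TGC) and refuses to act if the targeted codon is not TGA.

```python
CYSTEINE_CODONS = ("TGT", "TGC")


def replace_sec_codon(orf: str, codon_number: int, new_codon: str = "TGT") -> str:
    """Replace the in-frame TGA at `codon_number` (1-based) with a Cys codon."""
    orf = orf.upper()
    if new_codon not in CYSTEINE_CODONS:
        raise ValueError(f"{new_codon} does not encode cysteine")
    start = 3 * (codon_number - 1)
    codon = orf[start:start + 3]
    if codon != "TGA":
        raise ValueError(f"codon {codon_number} is {codon!r}, not TGA")
    return orf[:start] + new_codon + orf[start + 3:]


# Toy ORF (invented): ATG GCT TGA GGT TAA, with the in-frame TGA at codon 3.
toy_orf = "ATGGCTTGAGGTTAA"
print(replace_sec_codon(toy_orf, 3))
```

Addressing the codon by its in-frame number rather than by a plain text search avoids accidentally altering an out-of-frame TGA or the terminal stop codon.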

Two types of cDNA sequence variant may be produced. In the first type, the variation in
the cDNA sequence is not manifested as a change in the amino acid sequence of the encoded
polypeptide. These "silent" variations are simply a reflection of the degeneracy of the genetic code.
In the second type, the cDNA sequence variation does result in a change in the amino acid sequence
of the encoded protein, such as the U to C variation discussed above. In such cases, the variant
cDNA sequence produces a variant polypeptide sequence. In order to preserve the functional and
immunologic identity of the encoded polypeptide, it is preferred that any such amino acid
substitutions are "conservative." Conservative substitutions replace one amino acid with another
amino acid that is similar in size, hydrophobicity, etc. Examples of conservative substitutions are
shown in Table 4 below.

TABLE 4

Original Residue        Conservative Substitutions

Ala                     ser
Arg                     lys
Asn                     gln; his
Asp                     glu
Cys                     ser
Gln                     asn
Glu                     asp
Gly                     pro
His                     asn; gln
Ile                     leu; val
Leu                     ile; val
Lys                     arg; gln; glu
Met                     leu; ile
Phe                     met; leu; tyr
Ser                     thr
Thr                     ser
Trp                     tyr
Tyr                     trp; phe
Val                     ile; leu

Variations in the cDNA sequence that result in amino acid changes, whether
conservative or not, should be minimized in order to preserve the functional and immunologic
identity of the encoded protein. The immunologic identity of the protein may be assessed by
determining whether it is recognized by an anti-15 kDa selenoprotein antibody; a variant that is
recognized by such an antibody is immunologically conserved. Any cDNA sequence variant will
preferably introduce no more than 20, and preferably fewer than 10 amino acid substitutions into the
encoded polypeptide.
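
The conservative substitutions of Table 4 and the preferred limit on the number of substitutions can be checked programmatically. The sketch below is illustrative: the function name and the list-of-residues representation are assumptions, and the example sequences are invented rather than taken from the 15 kDa selenoprotein.

```python
# Table 4, keyed by original residue (three-letter codes).
CONSERVATIVE = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Cys": {"Ser"}, "Gln": {"Asn"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"}, "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"}, "Ser": {"Thr"}, "Thr": {"Ser"},
    "Trp": {"Tyr"}, "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def compare_variant(wild_type, variant, max_substitutions=20):
    """Compare two equal-length residue lists.

    Returns the number of substitutions, the non-conservative ones as
    (position, original, replacement) tuples, and whether the total is
    within the stated limit.
    """
    if len(wild_type) != len(variant):
        raise ValueError("sequences must be the same length (substitutions only)")
    changes = [
        (i, w, v)
        for i, (w, v) in enumerate(zip(wild_type, variant), start=1)
        if w != v
    ]
    non_conservative = [
        (i, w, v) for i, w, v in changes if v not in CONSERVATIVE.get(w, set())
    ]
    return {
        "substitutions": len(changes),
        "non_conservative": non_conservative,
        "within_limit": len(changes) <= max_substitutions,
    }


# Invented four-residue example.
result = compare_variant(["Met", "Ala", "Leu", "Lys"], ["Met", "Ser", "Ile", "Asp"])
print(result)
```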

EXAMPLE 2 

Obtaining 15 kDa Selenoprotein Genomic Genes

Having provided herein the cDNA sequence of the human and mouse 15 kDa selenoprotein
cDNAs, cloning of the corresponding genomic nucleotide sequences is now enabled. These genomic
sequences may readily be obtained by standard laboratory methods, such as RACE-PCR
amplification using a human genomic DNA library or genomic DNA extracted directly from human
or murine cells as a template.

Having the intron sequence data for the genomic sequence will be valuable for diagnostic
applications, e.g., looking for splice-site mutations. The various applications described below (e.g.,
expression of the 15 kDa selenoprotein for use in producing antibodies) are described using a 15 kDa
selenoprotein cDNA sequence, but may also be performed using the corresponding genomic
sequence.

EXAMPLE 3 

Expression and Purification of 15 kDa Selenoprotein Polypeptides 

With the provision of 15 kDa selenoprotein cDNA sequences, the expression and
purification of corresponding 15 kDa selenoprotein polypeptides by standard laboratory techniques is
now enabled. The purified polypeptide may be used for functional analyses, antibody production
and patient therapy. Furthermore, the DNA sequence of the 15 kDa selenoprotein cDNA and the
polymorphic cDNAs disclosed above can be manipulated in studies to understand the expression of
the gene and the function of its product. In this way, the underlying biochemical defect which results
from mutation or reduced expression of the 15 kDa selenoprotein can be established. The
polymorphic versions of the 15 kDa selenoprotein cDNA isolated to date, and others which may be
isolated based upon information contained herein, may be studied in order to detect alteration in
expression patterns in terms of relative quantities, tissue specificity and functional properties of the
encoded 15 kDa selenoprotein.

As noted above, for expression in prokaryotic, yeast and insect cells, it is possible to use a
sequence variant in which the TGA codon encoding selenocysteine at position 93 is replaced with a
codon encoding cysteine (such as TGC or TGT) (for convenience, in the following discussion, this
variant form of the protein is still referred to as the 15 kDa selenoprotein). Methods for expressing
large amounts of protein from a cloned gene introduced into Escherichia coli (E. coli) may be
utilized for the purification, localization and functional analysis of proteins. For example, fusion
proteins consisting of amino terminal peptides encoded by a portion of the E. coli lacZ or trpE gene
linked to the 15 kDa selenoprotein may be used to prepare polyclonal and monoclonal antibodies
against the protein. Thereafter, these antibodies may be used to purify proteins by immunoaffinity
chromatography, in diagnostic assays to quantitate the levels of protein, and to localize proteins in
tissues and individual cells by immunofluorescence.

The sequence variant or the native protein may also be produced in E. coli in large amounts
for functional studies. Methods and plasmid vectors for producing fusion proteins and intact native
proteins in bacteria are described in Sambrook et al. (In Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor, New York, 1989, chapter 17). Such fusion proteins may be made in large
amounts, are easy to purify, and can be used to elicit antibody response. Native proteins can be
produced in bacteria by placing a strong, regulated promoter and an efficient ribosome binding site
upstream of the cloned gene. If low levels of protein are produced, additional steps may be taken to
increase protein production; if high levels of protein are produced, purification is relatively easy.
Suitable methods are presented in Sambrook et al. (In Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor, New York, 1989) and are well known in the art. Often, proteins expressed at
high levels are found in insoluble inclusion bodies. Methods for extracting proteins from these
aggregates are described by Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor, New York, 1989, chapter 17).

Vector systems suitable for the expression of lacZ fusion genes include the pUR series of
vectors (Ruther and Muller-Hill, 1983, EMBO J. 2:1791), pEX1-3 (Stanley and Luzio, 1984, EMBO
J. 3:1429) and pMR100 (Gray et al., 1982, Proc. Natl. Acad. Sci. USA 79:6598). Vectors suitable for
the production of intact native proteins include pKC30 (Shimatake and Rosenberg, 1981, Nature
292:128), pKK177-3 (Amann and Brosius, 1985, Gene 40:183) and pET-3 (Studier and Moffatt,
1986, J. Mol. Biol. 189:113). 15 kDa selenoprotein fusion proteins may be isolated from protein
gels, lyophilized, ground into a powder and used as an antigen. The DNA sequence can also be
transferred to other cloning vehicles, such as other plasmids, bacteriophages, cosmids, animal viruses
and yeast artificial chromosomes (YACs) (Burke et al., 1987, Science 236:806-12). These vectors
may then be introduced into a variety of hosts including somatic cells, and simple or complex
organisms, such as bacteria, fungi (Timberlake and Marshall, 1989, Science 244:1313-7),
invertebrates, plants (Gasser and Fraley, 1989, Science 244:1293), and mammals (Pursel et al., 1989,
Science 244:1281-8), which cells or organisms are rendered transgenic by the introduction of the
heterologous 15 kDa selenoprotein cDNA.

For expression in mammalian cells, the cDNA sequence need not be modified to remove the
selenocysteine codon. Rather, the 15 kDa selenoprotein cDNA may be directly ligated to
heterologous promoters, such as the simian virus SV40 promoter in the pSV2 vector (Mulligan and
Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072-6), and introduced into cells, such as monkey
COS-1 cells (Gluzman, 1981, Cell 23:175-82), to achieve transient or long-term expression. The
stable integration of the chimeric gene construct may be maintained in mammalian cells by
biochemical selection, such as neomycin (Southern and Berg, 1982, J. Mol. Appl. Genet. 1:327-41)
and mycophenolic acid (Mulligan and Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072-6). Normal
mammalian cell growth medium contains sufficient trace selenium to permit efficient expression of
the 15 kDa selenoprotein (for example, selenium is present in fetal bovine serum). However, the
growth medium could be enriched if desired by the addition of selenite (Na₂SeO₃).

DNA sequences can be manipulated with standard procedures such as restriction enzyme 
digestion, fill-in with DNA polymerase, deletion by exonuclease, extension by terminal 
deoxynucleotide transferase, ligation of synthetic or cloned DNA sequences, site-directed sequence- 
alteration via single-stranded bacteriophage intermediate or with the use of specific oligonucleotides 
in combination with PCR. 

The cDNA sequence (or portions derived from it) or a minigene (a cDNA with an intron
and its own promoter) may be introduced into eukaryotic expression vectors by conventional
techniques. These vectors are designed to permit transcription of the cDNA in eukaryotic cells by
providing regulatory sequences that initiate and enhance the transcription of the cDNA and ensure its
proper splicing and polyadenylation. Vectors containing the promoter and enhancer regions of the
SV40 or long terminal repeat (LTR) of the Rous Sarcoma virus and polyadenylation and splicing
signal from SV40 are readily available (Mulligan et al., 1981, Proc. Natl. Acad. Sci. USA 78:2072-6;
Gorman et al., 1982, Proc. Natl. Acad. Sci. USA 78:6777-81). The level of expression of the cDNA
can be manipulated with this type of vector, either by using promoters that have different activities
(for example, the baculovirus pAC373 can express cDNAs at high levels in Spodoptera frugiperda
cells (Summers and Smith, 1985, In: Genetically Altered Viruses and the Environment, Fields et al.
(Eds.) 22:319-28, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York)) or by using
vectors that contain promoters amenable to modulation, for example, the glucocorticoid-responsive
promoter from the mouse mammary tumor virus (Lee et al., 1982, Nature 294:228). The expression
of the cDNA can be monitored in the recipient cells 24 to 72 hours after introduction (transient
expression).

In addition, some vectors contain selectable markers such as the gpt (Mulligan and Berg,
1981, Proc. Natl. Acad. Sci. USA 78:2072-6) or neo (Southern and Berg, 1982, J. Mol. Appl. Genet.
1:327-41) bacterial genes. These selectable markers permit selection of transfected cells that exhibit
stable, long-term expression of the vectors (and therefore the cDNA). The vectors can be maintained
in the cells as episomal, freely replicating entities by using regulatory elements of viruses such as
papilloma (Sarver et al., 1981, Mol. Cell. Biol. 1:486) or Epstein-Barr (Sugden et al., 1985, Mol. Cell.
Biol. 5:410). Alternatively, one can also produce cell lines that have integrated the vector into
genomic DNA. Both of these types of cell lines produce the gene product on a continuous basis.
One can also produce cell lines that have amplified the number of copies of the vector (and therefore
of the cDNA as well) to create cell lines that can produce high levels of the gene product (Alt et al.,
1978, J. Biol. Chem. 253:1357).
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The transfer of DNA into eukaryotic, in particular human or other mammalian cells, is now 
a conventional technique. The vectors are introduced into the recipient cells as pure DNA
(transfection) by, for example, precipitation with calcium phosphate (Graham and van der Eb, 1973,
Virology 52:466) or strontium phosphate (Brash et al., 1987, Mol. Cell. Biol. 7:2013), electroporation
(Neumann et al., 1982, EMBO J. 1:841), lipofection (Felgner et al., 1987, Proc. Natl. Acad. Sci. USA
84:7413), DEAE dextran (McCutchan et al., 1968, J. Natl. Cancer Inst. 41:351), microinjection
(Mueller et al., 1978, Cell 15:579), protoplast fusion (Schaffner, 1980, Proc. Natl. Acad. Sci. USA
77:2163-7), or pellet guns (Klein et al., 1987, Nature 327:70). Alternatively, the cDNA can be
introduced by infection with virus vectors. Systems have been developed that use, for example, retroviruses
(Bernstein et al., 1985, Gen. Engr'g 7:235), adenoviruses (Ahmad et al., 1986, J. Virol. 57:267), or
Herpes virus (Spaete et al., 1982, Cell 30:295).

These eukaryotic expression systems can be used for studies of the 15 kDa selenoprotein
gene and variant forms of this gene, the 15 kDa selenoprotein and variant forms of this protein. Such
uses include, for example, the identification of regulatory elements located in the 5' region of the 15
kDa selenoprotein gene on genomic clones that can be isolated from human genomic DNA libraries
using the information contained herein. The eukaryotic expression systems may also be used to
study the function of the normal complete protein, specific portions of the protein, or of naturally
occurring or artificially produced mutant proteins.

Using the above techniques, the expression vectors containing the 15 kDa selenoprotein
gene or cDNA sequence or fragments or variants or mutants thereof can be introduced into human
cells, mammalian cells from other species or non-mammalian cells as desired. For example, monkey
COS cells (Gluzman, 1981, Cell 23:175-82) that produce high levels of the SV40 T antigen and
permit the replication of vectors containing the SV40 origin of replication may be used. Similarly,
Chinese hamster ovary (CHO), mouse NIH 3T3 fibroblasts or human fibroblasts or lymphoblasts
may be used.

Expression of the 15 kDa selenoprotein in eukaryotic cells may be used as a source of
proteins to raise antibodies. The 15 kDa selenoprotein may be extracted following release of the
protein into the supernatant as described above, or the cDNA sequence may be incorporated into a
eukaryotic expression vector and expressed as a chimeric protein with, for example, β-globin.
Antibody to β-globin is thereafter used to purify the chimeric protein. Corresponding protease
cleavage sites engineered between the β-globin gene and the cDNA are then used to separate the two
polypeptide fragments from one another after translation. One useful expression vector for
generating β-globin chimeric proteins is pSG5 (Stratagene). This vector encodes rabbit β-globin.

The present invention thus includes recombinant vectors comprising the selected DNA of
the DNA sequences of this invention (e.g., the entire 15 kDa selenoprotein cDNA) for expression in a
suitable host. The DNA is operatively linked in the vector to an expression control sequence in the
recombinant DNA molecule so that the 15 kDa selenoprotein can be expressed. The expression
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amniocentesis samples and autopsy material. Alternatively, the assay may be performed on cDNA 
made from mRNA obtained from a biological sample. Mutations in the 15 kDa
selenoprotein gene may be detected using single-strand conformational polymorphism (SSCP)
analysis. The detection in the biological sample of either a mutant 15 kDa selenoprotein gene or a
mutant 15 kDa selenoprotein RNA may also be performed by a number of other methodologies
known in the art, as outlined below. In particular, the presence of the polymorphic form
C811/G1125 may be detected by such means.

Generically, methods for detecting polymorphisms in a gene sequence may be performed
using probes that specifically hybridize to either only the wild-type gene sequence or only a
particular polymorphic form of that sequence. Thus, a method for detecting a polymorphism in a
human 15 kDa selenoprotein gene, cDNA or RNA in a biological sample comprises hybridizing the
sample with a nucleic acid probe under conditions whereby the probe will hybridize to 15 kDa
selenoprotein gene, cDNA or RNA carrying a specified particular polymorphism, such as T811,
A1125 or T811/A1125, but not to the other polymorphism of the 15 kDa selenoprotein gene, cDNA
or RNA (C811/G1125). For such purposes, the human "wild-type" sequence is considered to be that
shown in Seq. I.D. No. 2.

Another suitable detection technique is polymerase chain reaction amplification of
reverse transcribed RNA (RT-PCR) isolated from lymphocytes, followed by direct DNA
sequence determination of the products. The presence of one or more nucleotide differences between
the obtained sequence and the 15 kDa selenoprotein cDNA sequence presented herein, and
especially differences in the ORF or SECIS portions of the nucleotide sequence, are taken as
indicative of a potential 15 kDa selenoprotein gene mutation.
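
The comparison of an obtained sequence against the reference cDNA can be sketched as a position-by-position scan that flags each difference and reports whether it falls in the ORF or SECIS portion. This is a simplified illustration under stated assumptions: the two sequences are taken to be already aligned and of equal length (substitutions only), and the sequences and region coordinates below are invented placeholders, not the actual 15 kDa selenoprotein coordinates.

```python
def find_differences(reference: str, observed: str, regions: dict):
    """List substitutions as (position, reference base, observed base, region).

    `regions` maps a region name to a 1-based inclusive (start, end) range.
    Positions outside every region are labelled "other".
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must be pre-aligned and of equal length")
    differences = []
    for pos, (ref, obs) in enumerate(zip(reference.upper(), observed.upper()), start=1):
        if ref != obs:
            region = next(
                (name for name, (start, end) in regions.items() if start <= pos <= end),
                "other",
            )
            differences.append((pos, ref, obs, region))
    return differences


# Invented toy data: a 22-nt "reference" with hypothetical region boundaries.
reference = "ATGGCTTGAGGTTAACCAAGAT"
regions = {"ORF": (1, 15), "SECIS": (18, 22)}

observed = list(reference)
observed[4] = "T"    # substitution at position 5 (inside the toy ORF)
observed[19] = "A"   # substitution at position 20 (inside the toy SECIS)
observed = "".join(observed)

for diff in find_differences(reference, observed, regions):
    print(diff)
```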

Because of the diploid nature of the human genome, both copies of the 15 kDa
selenoprotein gene need to be examined to distinguish between heterozygotes and homozygotes. A
person who is heterozygous for a mutant form of the 15 kDa selenoprotein (i.e., having one mutant
form and one "normal" form) may nevertheless be unaffected by the presence of the mutation.
Primer extension or restriction digestion analysis allows for the rapid determination of the genotype
of an individual, as described above.
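Once both copies of the gene have been typed, the distinction between heterozygotes and homozygotes reduces to a comparison of the two allele labels. A minimal sketch follows; the function name and the label format are assumptions made for illustration.

```python
def classify_genotype(allele_1, allele_2):
    """Classify a diploid genotype from two allele labels."""
    if allele_1 == allele_2:
        return "homozygous " + allele_1
    # Sort so the result does not depend on the order the alleles were typed.
    first, second = sorted((allele_1, allele_2))
    return "heterozygous " + first + " + " + second


if __name__ == "__main__":
    print(classify_genotype("T811/A1125", "C811/G1125"))
    print(classify_genotype("C811/G1125", "C811/G1125"))
```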

Alternatively, DNA extracted from lymphocytes or other cells may be used directly for
amplification. The direct amplification from genomic DNA would be appropriate for analysis of the
entire 15 kDa selenoprotein gene including regulatory sequences located upstream and downstream
from the open reading frame. Reviews of direct DNA diagnosis have been presented by Caskey
(Science 236:1223-8, 1989) and by Landegren et al. (Science 242:229-37, 1989).

Further studies of 15 kDa selenoprotein genes isolated from cancer patients may reveal
particular mutations/polymorphisms that occur at a high frequency within this population of
individuals. In this case, rather than sequencing the entire 15 kDa selenoprotein gene, it may be
possible to design DNA diagnostic methods to specifically detect the most common mutations.

The detection of specific DNA mutations may be achieved by methods such as
hybridization using specific oligonucleotides (Wallace et al., 1986, Cold Spring Harbor Symp.
Quant. Biol. 51:257-61), direct DNA sequencing (Church and Gilbert, 1988, Proc. Natl. Acad. Sci.
USA 81:1991-5), the use of restriction enzymes (Flavell et al., 1978, Cell 15:25; Geever et al., 1981,
Proc. Natl. Acad. Sci. USA 78:5081), discrimination on the basis of electrophoretic mobility in gels
with denaturing reagent (Myers and Maniatis, 1986, Cold Spring Harbor Symp. Quant. Biol. 51:275-
84), RNase protection (Myers et al., 1985, Science 230:1242), chemical cleavage (Cotton et al.,
1985, Proc. Natl. Acad. Sci. USA 85:4397-4401), and the ligase-mediated detection procedure
(Landegren et al., 1988, Science 241:1077).

By way of example, oligonucleotides specific to normal or mutant sequences may be
chemically synthesized using commercially available machines, labelled radioactively with isotopes
(such as 32P) or non-radioactively with tags such as biotin (Ward and Langer, 1981, Proc. Natl. Acad.
Sci. USA 78:6633-57), and hybridized to individual DNA samples immobilized on membranes or
other solid supports by dot-blot or transfer from gels after electrophoresis. The presence or absence
of these specific sequences may then be visualized by methods such as autoradiography or
fluorometric (Landegren et al., 1989, Science 242:229-37) or colorimetric reactions (Gebeyehu et
al., 1987, Nucl. Acids Res. 15:4513-34).

Sequence differences between normal and mutant forms of the gene may also be revealed
by the direct DNA sequencing method of Church and Gilbert (Proc. Natl. Acad. Sci. USA 81:1991-5,
1988). Cloned DNA segments may be used as probes to detect specific DNA segments. The
sensitivity of this method is greatly enhanced when combined with PCR (Wrischnik et al., 1987,
Nucleic Acids Res. 15:529-42; Wong et al., 1987, Nature 330:384-6; Stoflet et al., 1988, Science
239:491-4). In this approach, a sequencing primer which lies within the amplified sequence is used
with double-stranded PCR product or single-stranded template generated by a modified PCR. The
sequence determination is performed by conventional procedures with radiolabeled nucleotides or by
automatic sequencing procedures with fluorescent tags.

Sequence alterations may occasionally generate fortuitous restriction enzyme recognition 
sites or may eliminate existing restriction sites. Changes in restriction sites are revealed by the use of 
appropriate enzyme digestion followed by conventional gel-blot hybridization (Southern, 1975, J.
Mol. Biol. 98:503). DNA fragments carrying the site (either normal or mutant) are detected by their
reduction in size or increase of corresponding restriction fragment numbers. Genomic DNA samples 
may also be amplified by PCR prior to treatment with the appropriate restriction enzyme; fragments 
of different sizes are then visualized under UV light in the presence of ethidium bromide after gel 
electrophoresis. 
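The effect of gaining or losing a restriction site on fragment sizes may be illustrated with a short calculation. The sketch below is hypothetical: the EcoRI recognition sequence (GAATTC, cut after the first G) is used only as a familiar example, the toy sequences are not derived from the 15 kDa selenoprotein gene, and the molecule is assumed to be linear.

```python
def fragment_lengths(dna, site, cut_offset):
    """Lengths of the fragments produced by cutting a linear molecule.

    cut_offset is the number of bases into the recognition site at which
    the top strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    dna = dna.upper()
    site = site.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    normal = "A" * 10 + "GAATTC" + "A" * 10  # site present
    mutant = "A" * 10 + "GAGTTC" + "A" * 10  # single base change removes the site
    print(fragment_lengths(normal, "GAATTC", 1))
    print(fragment_lengths(mutant, "GAATTC", 1))
```

A single uncut band in place of two smaller bands corresponds to the loss of the site described above; the reverse pattern corresponds to a fortuitously created site.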

Genetic testing based on DNA sequence differences may be achieved by detection of
alteration in electrophoretic mobility of DNA fragments in gels with or without denaturing reagent.
Small sequence deletions and insertions can be visualized by high-resolution gel electrophoresis. For
example, a PCR product with small deletions is clearly distinguishable from a normal sequence on an
8% non-denaturing polyacrylamide gel (Nagamine et al., 1989, Am. J. Hum. Genet. 45:337-9). DNA
fragments of different sequence compositions may be distinguished on denaturing formamide
gradient gels in which the mobilities of different DNA fragments are retarded in the gel at different
positions according to their specific "partial-melting" temperatures (Myers et al., 1985, Science
230:1242). Alternatively, a method of detecting a mutation comprising a single base substitution or
other small change could be based on differential primer length in a PCR. For example, an invariant
primer could be used in addition to a primer specific for a mutation. The PCR products of the
normal and mutant genes can then be differentially detected in acrylamide gels.

In addition to conventional gel-electrophoresis and blot-hybridization methods, DNA
fragments may also be visualized by methods where the individual DNA samples are not
immobilized on membranes. The probe and target sequences may be both in solution, or the probe
sequence may be immobilized (Saiki et al., 1989, Proc. Nat. Acad. Sci. USA 86:6230-4). A variety
of detection methods, such as autoradiography involving radioisotopes, direct detection of
radioactive decay (in the presence or absence of scintillant), spectrophotometry involving colorigenic
reactions and fluorometry involving fluorogenic reactions, may be used to identify specific individual
genotypes.

If more than one mutation is frequently encountered in the 15 kDa selenoprotein gene, a
system capable of detecting such multiple mutations would be desirable. For example, a PCR with
multiple, specific oligonucleotide primers and hybridization probes may be used to identify all
possible mutations at the same time (Chamberlain et al., 1988, Nucl. Acids Res. 16:11141-55). The
procedure may involve immobilized sequence-specific oligonucleotide probes (Saiki et al., 1989,
Proc. Nat. Acad. Sci. USA 86:6230-4).

One method that is expected to be particularly suitable for detecting mutations in the 15 kDa
selenoprotein gene is the use of high density oligonucleotide arrays (also known as "DNA chips") as
described by Hacia et al. (Nature Genetics 14:441-7, 1996).

EXAMPLE 6 

Detection and Quantification of 15 kDa Selenoprotein mRNA and Polypeptide 

The compositions of the present invention, including 15 kDa selenoprotein-specific
antibodies and nucleic acid probes and primers, may be used to detect and/or quantify the level of 15
kDa selenoprotein polypeptide or mRNA in a biological sample. Biological samples suitable for
analysis include biopsy samples, such as tumor biopsies, and biological fluids containing cellular
material, such as blood, cerebrospinal fluid and saliva.

Determining and/or quantifying the levels of 15 kDa selenoprotein polypeptide and mRNA
would be useful for detecting reduced levels of the 15 kDa selenoprotein and mRNA which result
from, for example, mutations in the promoter regions of the 15 kDa selenoprotein gene or mutations
within the coding region of the gene which produce truncated, non-functional polypeptides. In
addition, such determinations may provide valuable information about the ability of the cell to
incorporate selenium into proteins, as well as information about oxidative stress. Abnormally low
levels of 15 kDa selenoprotein polypeptide or mRNA may be indicative of the presence of cancer;
such measurements may also be useful to measure the efficacy of cancer treatment.

The determination of reduced 15 kDa selenoprotein polypeptide or mRNA levels would be
an alternative or supplemental approach to the direct determination of a patient's status by nucleotide
sequence determination outlined above. The availability of antibodies specific to the 15 kDa
selenoprotein polypeptide allows the quantitation of cellular 15 kDa selenoprotein polypeptide by
one of a number of immunoassay methods which are well known in the art and are presented in
Harlow and Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, New York,
1988). Such methods include antibody capture assays, antigen capture assays and two-antibody
sandwich assays. For certain assays, a detectable label may be conjugated to the antibody. Suitable
detectable labels include radioactive labels, fluorescent labels and enzymes. Detection and
quantification of 15 kDa selenoprotein mRNA levels in a biological sample may be achieved using
the probes and primers described above in conjunction with standard laboratory techniques,
including quantitative RT-PCR and Northern blotting.

A significant (preferably 50% or greater) reduction in the amount of 15 kDa selenoprotein
polypeptide in the cells of a subject compared to the amount of 15 kDa selenoprotein polypeptide
found in control ("healthy") cells would be taken as an indication that the subject may be suffering
from, or at risk from, cancer.
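The 50% criterion stated above may be expressed as a simple calculation on two measured levels. The following sketch is illustrative only: the function names are hypothetical, the units are arbitrary provided both measurements share them, and no statistical treatment of replicate measurements is attempted.

```python
def fractional_reduction(subject_level, control_level):
    """Fraction by which the subject level falls below the control level."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return 1.0 - subject_level / control_level


def is_significantly_reduced(subject_level, control_level, threshold=0.5):
    """True when the reduction meets the threshold (default 50% or greater)."""
    return fractional_reduction(subject_level, control_level) >= threshold


if __name__ == "__main__":
    # Arbitrary example values, e.g. densitometry units from an immunoblot.
    print(is_significantly_reduced(40.0, 100.0))
    print(is_significantly_reduced(60.0, 100.0))
```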

The present invention also encompasses kits suitable for the detection and quantification of
15 kDa selenoprotein polypeptide or mRNA in biological specimens. Kits suitable for detecting
and/or quantifying the polypeptide comprise a container holding a 15 kDa selenoprotein
polypeptide-specific binding agent, such as a monoclonal antibody. In certain embodiments, the
antibody may be bound to a solid substrate, such as a column or microtiter plate well. In other
embodiments, the kit may further include a second specific binding agent that specifically binds to
either the 15 kDa selenoprotein polypeptide, or the first specific binding agent. The second specific
binding agent may be conjugated with a label molecule that facilitates detection of the second agent
when bound to its target. Suitable label molecules are well known in the art and include enzymes,
fluorophores and radionuclides. Kits suitable for detecting or quantifying the 15 kDa selenoprotein
mRNA comprise a container holding one or more nucleic acid primers or probes as provided above.
In certain embodiments, the nucleic acid probes may be conjugated to a suitable label molecule that
facilitates detection of the probe when bound to its target. Suitable label molecules are known in the
art and include radionuclides and biotin.

An alternative approach to detecting and quantifying levels of the 15 kDa selenoprotein in
cells or in an animal is to use the 75Se isotope. This may be accomplished by a number of methods,

models. Suitable techniques for generating such transgenic animal models include those described in
U.S. Patent Nos. 5,489,742 ("Transgenic rats and animal models of inflammatory disease"), 
5,489,743 ("Transgenic animal models for thrombocytopenia"), 5,304,489 ("DNA sequences to target 
proteins to the mammary gland for efficient secretion"), 5,476,995 ("Peptide production"), and 
5,487,992 ("Cells and non-human organisms containing predetermined genomic modifications and 
positive-negative selection methods and vectors for making same"), and references cited therein. 

The relationship between the 15 kDa selenoprotein and cancer may be further explored by the creation
of double transgenic mice, transgenic for oncogene sequences as well as nucleic acids that encode the 
15 kDa selenoprotein. In addition, nucleic acids encoding the 15 kDa selenoprotein may be 
introduced into tumor cells, which cells may then be used to study tumorigenesis in laboratory 
animal models, such as mice. 

In addition, conditional gene silencing (targeting) can be used to generate transgenic mice 
(for reviews see Porter, 1998, Trends Genetics, vol. 14; Rajewsky et al., 1996, J. Clin. Invest. 
98:S51-S53). Conditional silencing of a gene allows cells to accumulate prior to the inactivation 
(functional deletion) of the gene. This approach is advantageous for several reasons. If the gene of 
interest is an essential gene, mutations in that gene might be lethal, leaving no mouse to study gene 
function. In addition, this method allows one to generate models of somatically acquired genetic 
diseases, such as most forms of cancer, rather than of inherited ones. The strategy of this method 
utilizes the bacteriophage-derived Cre-lox system. The Cre enzyme recognizes a sequence motif of 
34 bp, called loxP. If a DNA segment is flanked by two loxP sites in the same orientation, Cre excises
that segment from the DNA, leaving a single loxP site behind. Conditional targeting is accomplished 
by crossing responder mice, carrying the loxP flanked target gene, with regulator mice carrying the 
Cre transgene, which is expressed in a cell-type-specific or inducible manner. 
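The excision step of the Cre-lox system may be modelled on a sequence string. The sketch below is a simplified illustration, not a description of the biochemical mechanism: it uses the commonly cited 34 bp loxP sequence, considers only two sites in the same orientation on the strand as written, and the function name and flanking sequences are hypothetical.

```python
# Commonly cited 34 bp loxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(dna):
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site is left behind, as described in the text. Sequences with
    fewer than two sites are returned unchanged.
    """
    dna = dna.upper()
    first = dna.find(LOXP)
    if first == -1:
        return dna
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna
    # Keep everything before the first site, then resume at the second site,
    # so exactly one loxP copy remains at the junction.
    return dna[:first] + dna[second:]


if __name__ == "__main__":
    floxed = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
    recombined = cre_excise(floxed)
    print(len(floxed), len(recombined), recombined.count(LOXP))
```

Sites in opposite orientation lead to inversion rather than excision and are deliberately not handled here.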

EXAMPLE 9 
Dietary Selenium

As described above, the present invention describes for the first time the existence of the 15 
kDa selenoprotein, provides evidence of a link between low levels of this protein and cancer, and 
provides methods for determining levels of the 15 kDa selenoprotein. Supplementation of the diet 
with selenium represents one way in which the level of the 15 kDa selenoprotein may be enhanced, 
with the goal of reducing susceptibility to cancer in patients with a predetermined genetic 
susceptibility. 

Thus, the present invention provides a method for enhancing the level of the 15 kDa 
selenoprotein in a mammal, by administering to the mammal a dietary selenium supplement. In one 
embodiment, the method involves a prior determination that the level of 15 kDa selenoprotein in the 
mammal is lower than the measured average for such mammals. Thus, the invention provides a 
method for dietary regulation in which the level of 15 kDa selenoprotein in the cells of a mammal is 
measured. If the level is below normal, the endogenous selenium level is enhanced by
providing selenium supplementation in the diet of the mammal. Such supplementation can take the
form of an oral supplement, such as the oral administration of 200 µg of selenium per day, as
described by Clark et al. (JAMA, 276:1957-63, 1996).

EXAMPLE 10 
Gene Therapy 

In some embodiments, the present invention relates to a method of treating tumors by 
overexpressing the 15 kDa selenoprotein in cells which have an abnormally low amount of the 15 kDa 
selenoprotein, or in the cells of a patient having a higher risk for cancers associated with low levels of
15 kDa selenoprotein. These methods may be accomplished by introducing a gene coding for the 15 
kDa selenoprotein (or a variant thereof) into the person. A general strategy for transferring genes into 
donor cells is disclosed in U.S. Patent No. 5,529,774. Generally, a gene encoding a protein having 
therapeutically desired effects is cloned into a viral expression vector, and that vector is then 
introduced into the target organism. The virus infects the cells, and produces the protein sequence in 
vivo, where it has its desired therapeutic effect. See, for example, Zabner et al. (Cell 75:207-16, 1993).

In some of the foregoing examples, it may only be necessary to introduce the genetic or 
protein elements into certain cells or tissues. For example, in the case of benign nevi and psoriasis, 
introducing them into only the skin may be sufficient. However, in some instances (e.g., tumors and
polycythemia inflammatory fibrosis), it may be more therapeutically effective and simple to treat all 
of the patient's cells, or more broadly disseminate the vector, for example by intravascular 
administration. 

The nucleic acid sequence encoding at least one therapeutic agent is under the control of a 
suitable promoter. Suitable promoters which may be employed include, but are not limited to, the 
gene's native promoter, retroviral LTR promoter, or adenoviral promoters, such as the adenoviral 
major late promoter; the cytomegalovirus (CMV) promoter; the Rous Sarcoma Virus (RSV) promoter; 
inducible promoters, such as the MMTV promoter; the metallothionein promoter; heat shock 
promoters; the albumin promoter; the histone promoter; the β-actin promoter; TK promoters; B19
parvovirus promoters; and the ApoAI promoter. However, the scope of the present invention is not
limited to specific foreign genes or promoters. 

The recombinant nucleic acid can be administered to the animal host by any method which 
allows the recombinant nucleic acid to reach the appropriate cells. These methods include injection, 
infusion, deposition, implantation, or topical administration. Injections can be intradermal or 
subcutaneous. The recombinant nucleic acid can be delivered as part of a viral vector, such as avipox 
viruses, recombinant vaccinia virus, replication-deficient adenovirus strains or poliovirus, or as a
non-infectious form such as naked DNA or liposome encapsulated DNA.

EXAMPLE 11
Viral Vectors for Gene Therapy 

therapeutic agent then may be cloned into the adenoviral DNA. The modified adenoviral genome then
is excised from the adenovirus yeast artificial chromosome in order to be used to generate adenoviral 
vector particles as hereinabove described. 

The adenoviral particles are administered in an amount effective to produce a therapeutic
effect in a host. The exact dosage of adenoviral particles to be administered is dependent upon a
variety of factors, including the age, weight, and sex of the patient to be treated, and the nature and
extent of the disease or disorder to be treated. The adenoviral particles may be administered as part of
a preparation having a titer of adenoviral particles of at least 1 x 10^10 pfu/ml, and in general not
exceeding 2 x 10^11 pfu/ml. The adenoviral particles may be administered in combination with a
pharmaceutically acceptable carrier in a volume up to 10 ml. The pharmaceutically acceptable carrier
may be, for example, a liquid carrier such as a saline solution, protamine sulfate (Elkins-Sinn, Inc.,
Cherry Hill, N.J.), or Polybrene (Sigma Chemical).
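The titer and volume limits given above bound the total number of plaque-forming units in a single preparation. The arithmetic may be sketched as follows; the function name and the strict range checks are assumptions made for illustration, and the calculation is not dosing guidance.

```python
MIN_TITER_PFU_PER_ML = 1e10   # "at least 1 x 10^10 pfu/ml"
MAX_TITER_PFU_PER_ML = 2e11   # "not exceeding 2 x 10^11 pfu/ml"
MAX_VOLUME_ML = 10.0          # "a volume up to 10 ml"


def total_pfu(titer_pfu_per_ml, volume_ml):
    """Total plaque-forming units in a preparation within the stated limits."""
    if not MIN_TITER_PFU_PER_ML <= titer_pfu_per_ml <= MAX_TITER_PFU_PER_ML:
        raise ValueError("titer outside the range given in the text")
    if not 0 < volume_ml <= MAX_VOLUME_ML:
        raise ValueError("volume outside the range given in the text")
    return titer_pfu_per_ml * volume_ml


if __name__ == "__main__":
    # Upper bound implied by the text: 2 x 10^11 pfu/ml in 10 ml.
    print(f"{total_pfu(MAX_TITER_PFU_PER_ML, MAX_VOLUME_ML):.1e} pfu")
```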

In another embodiment, the viral vector is a retroviral vector. Examples of retroviral vectors 
which may be employed include, but are not limited to, Moloney Murine Leukemia Virus, spleen 
necrosis virus, and vectors derived from retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma 
Virus, avian leukosis virus, human immunodeficiency virus, myeloproliferative sarcoma virus, and 
mammary tumor virus. The vector is generally a replication defective retrovirus particle. 

Retroviral vectors are useful as agents to effect retroviral-mediated gene transfer into 
eukaryotic cells. Retroviral vectors are generally constructed such that the majority of sequences 
coding for the structural genes of the virus are deleted and replaced by the gene(s) of interest. Most 
often, the structural genes (i.e., gag, pol, and env), are removed from the retroviral backbone using 
genetic engineering techniques known in the art. This may include digestion with the appropriate 
restriction endonuclease or, in some instances, with Bal 31 exonuclease to generate fragments
containing appropriate portions of the packaging signal. 

New genes may be incorporated into proviral backbones in several general ways. In the 
most straightforward constructions, the structural genes of the retrovirus are replaced by a single 
gene which then is transcribed under the control of the viral regulatory sequences within the long 
terminal repeat (LTR). Retroviral vectors have also been constructed which can introduce more than 
one gene into target cells. Usually, in such vectors one gene is under the regulatory control of the 
viral LTR, while the second gene is expressed either off a spliced message or is under the regulation 
of its own, internal promoter. Alternatively, two genes may be expressed from a single promoter by 
the use of an Internal Ribosome Entry Site. 

Having illustrated and described the principles of isolating the human 15 kDa selenoprotein 
cDNA and corresponding gene and its murine homolog, the proteins encoded by these genes and
modes of use of these biological molecules, it should be apparent to one skilled in the art that the 
invention can be modified in arrangement and detail without departing from such principles. We 
claim all modifications coming within the spirit and scope of the claims presented herein. 



